Search across Different Media: Numeric Data Sets and Text

Authors

  • Michael Buckland
  • Aitao Chen
  • Fredric C. Gey
  • Ray R. Larson
Abstract

Digital technology encourages the hope of searching across and between different media forms (text, sound, image, numeric data). We describe topic searches in two different media, text files and socio-economic numeric databases, as well as transverse searching, whereby retrieved text is used to find topically related numeric data and vice versa. Direct transverse searching across different media is impossible. Descriptive metadata provides an enabling infrastructure, but usually requires mappings between different vocabularies and a search term recommender system. Statistical association techniques and natural language processing can help. Searches in socio-economic numeric databases ordinarily require that place and time be specified.

Similar resources

Entry Vocabulary - a Technology to Enhance Digital Search

This paper describes a search technology which enables improved search across diverse genres of digital objects: documents, patents, cross-language retrieval, numeric data and images. The technology leverages human indexing of objects in specialized domains to provide increased accessibility to non-expert searchers. Our approach is to reverse-engineer text categorization to supply mappings fro...

Sentiment Analysis of Surveys using both Numeric Ratings and Text Comments

Surveys are a common approach to data collection, mostly for the purpose of opinion analysis. There are in general two ways of analyzing people’s opinions about items in surveys. One is based on quantitative data collected from surveys using statistical approaches, and this approach has been around for many years. The second, which is referred to as sentiment analysis, is to extract the atti...

The Integration of the World Wide Web and Intranet Data Resources

The explosive growth in the volume of information available on the Web and in enterprise databases continues unabated. Managing these large quantities of information remains a challenge for both government and industry. TRW’s Digital Media Systems Lab has developed a research platform, InfoWeb, that can be described as an “information infrastructure” that provides seamless access to Web search ...

Heterogeneous Metric Learning with Joint Graph Regularization for Cross-Media Retrieval

As a major component of big data, unstructured heterogeneous multimedia content such as text, image, audio, video and 3D is increasing rapidly on the Internet. Users demand a new type of cross-media retrieval in which they can search for results across various media by submitting a query of any media type. Since the query and the retrieved results can be of different media, how to learn a heterogeneous metric ...

Cross-domain Text Classification with Multiple Domains and Disparate Label Sets

Advances in transfer learning have relaxed the limitation that traditional supervised machine learning algorithms depend on annotated training data to train new models for every new domain. However, several applications encounter scenarios where models need to transfer/adapt across domains when the label sets vary both in the number of labels and in their connotations. Th...

Publication date: 2006